
Conversation

@pthmas (Contributor) commented Jan 16, 2026

Overview

  • Adds a height-based pruning mechanism to the ev-node store:
    • Prunes /h/{height}, /d/{height}, /c/{height}, /i/{hash} up to a target height (see the sketch after this list).
    • Also prunes per-height DA metadata (height → DA height mapping) for pruned heights.
    • Leaves /s/{height} (state), /t (current height), and global metadata keys untouched.
    • Tracks progress via a last-pruned-block-height metadata key so pruning is monotonic and idempotent.
  • Integrates pruning into block production:
    • After a block is committed, at configurable intervals, the executor computes a prune target and calls store.PruneBlocks synchronously.
  • Wires pruning into the go-header store:
    • Sets the go-header pruning window to PruningKeepRecent * BlockTime so header/data retention matches ev-node’s window (e.g., with PruningKeepRecent = 600 and a 1 s block time, go-header retains roughly the last 10 minutes of headers).
  • Adds pruning for EVM execution metadata:
    • EVMStore exposes PruneExecMeta to delete per-height ExecMeta entries up to a target height.
    • The EVM executor implements an optional ExecMetaPruner interface, and the block executor calls it from the same pruning hook.
    • Non-EVM (e.g. ABCI) executors don’t implement this interface, so they are unaffected.
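For illustration, the per-height deletion referenced in the list above might look like the following sketch. The helper name pruneHeight, the literal key formats, and the hash parameter are illustrative assumptions, not the PR's actual identifiers; ds is github.com/ipfs/go-datastore.

import (
    "context"
    "errors"
    "fmt"

    ds "github.com/ipfs/go-datastore"
)

// pruneHeight deletes one height's entries across the pruned namespaces.
// State snapshots (/s/{height}) and the current-height key (/t) are
// intentionally left untouched.
func pruneHeight(ctx context.Context, batch ds.Batch, height uint64, headerHash []byte) error {
    keys := []ds.Key{
        ds.NewKey(fmt.Sprintf("/h/%d", height)),     // header
        ds.NewKey(fmt.Sprintf("/d/%d", height)),     // data
        ds.NewKey(fmt.Sprintf("/c/%d", height)),     // signature/commit
        ds.NewKey(fmt.Sprintf("/i/%x", headerHash)), // hash index
    }
    for _, k := range keys {
        if err := batch.Delete(ctx, k); err != nil && !errors.Is(err, ds.ErrNotFound) {
            return fmt.Errorf("prune height %d: %w", height, err)
        }
    }
    return nil
}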

Config

New node config fields / flags:

  • --evnode.node.pruning_enabled
  • --evnode.node.pruning_keep_recent
  • --evnode.node.pruning_interval

Pruning runs only when all three are set to non-trivial values (pruning_enabled = true, pruning_keep_recent > 0, pruning_interval > 0).
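For example, a node started with --evnode.node.pruning_enabled=true --evnode.node.pruning_keep_recent=1000 --evnode.node.pruning_interval=100 would retain the most recent 1000 blocks and run the pruning hook every 100th block.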

Design trade-offs

  • Runs in the critical path:
    Pruning is called synchronously in produceBlock. On pruning heights, block production incurs extra latency that scales with the amount of data pruned and with underlying storage performance.

  • Best-effort pruning:
    If pruning fails, we log the error but do not fail the block. This avoids halting the node but means disk usage can grow silently if pruning keeps failing; operators will need to watch logs/metrics.

  • Scope:
    This PR prunes ev-node’s own store, the go-header store, and EVM ExecMeta. Pruning of ABCI state (ev-abci) is out of scope and will be added in a follow-up once ev-abci exposes its own pruning API.

@github-actions bot commented Jan 16, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ⏩ skipped | ✅ passed | ✅ passed | Jan 27, 2026, 4:21 PM |

@claude bot commented Jan 16, 2026

Claude finished @pthmas's task in 2m 23s.


Comprehensive Code Review Complete ✅

I've completed a thorough review of the block pruning implementation across all 16 changed files. Overall, this is a well-designed and carefully implemented feature with excellent test coverage and proper separation of concerns. Below is my detailed analysis across multiple dimensions.


1. Code Quality & Architecture ⭐⭐⭐⭐½

Strengths:

  • Clean separation of concerns: Pruning logic is properly separated into distinct layers (store, execution, block executor)
  • Idiomatic Go style: Consistent error wrapping, proper use of contexts, clean interfaces
  • Good abstraction design: Pruner interface in store/types.go:85-94 is well-defined and composable
  • Modular pruning strategy: EVM execution metadata pruning is cleanly separated via optional ExecMetaPruner interface (core/execution/execution.go:165-177)

Minor Issues:

pkg/store/store.go:277-368 - PruneBlocks implementation:

  • Line 340-348: The DA metadata pruning logic could be extracted into a helper function to reduce code repetition
    // Suggestion: Extract DA metadata cleanup
    func (s *DefaultStore) pruneDaMetadata(ctx context.Context, batch ds.Batch, h uint64) error {
        hDaKey := getMetaKey(GetHeightToDAHeightHeaderKey(h))
        dDaKey := getMetaKey(GetHeightToDAHeightDataKey(h))
        for _, key := range []string{hDaKey, dDaKey} {
            if err := batch.Delete(ctx, ds.NewKey(key)); err != nil {
                if !errors.Is(err, ds.ErrNotFound) {
                    return fmt.Errorf("failed to delete DA metadata at height %d: %w", h, err)
                }
            }
        }
        return nil
    }

block/internal/executing/executor.go:451-473 - Pruning integration:

  • Line 451: Hardcoded check e.config.Node.PruningEnabled && e.config.Node.PruningKeepRecent > 0 && e.config.Node.PruningInterval > 0 is repeated. Consider extracting to a method e.shouldPrune() for better maintainability
  • Line 466-470: The optional ExecMetaPruner check is clean, but consider logging when pruning is skipped for non-EVM executors to aid debugging

execution/evm/store.go:148-194 - ExecMeta pruning:

  • Line 159: the len(data) == 8 check relies on a magic number; consider extracting a constant, e.g. const heightEncodingLength = 8
  • Line 173-180: Batch delete loop could benefit from tracking and logging statistics (e.g., "pruned N ExecMeta entries")

2. Error Handling & Resilience ⭐⭐⭐⭐⭐

Excellent Practices:

  • Best-effort pruning with graceful degradation: executor.go:458 logs errors but doesn't fail block production—correct approach for non-critical maintenance
  • Idempotent pruning with progress tracking: Both store.go:290-306 and evm/store.go:152-166 properly track last-pruned-height to avoid redundant work (see the sketch after this list)
  • Proper error wrapping: Consistent use of fmt.Errorf("... %w", err) throughout for error context
  • ErrNotFound tolerance: store.go:314-316, 321-324, 327-330, 333-336, 340-348, 351-356 correctly handles missing entries during pruning
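For reference, the progress-tracking pattern called out above generally reduces to a read, check, and advance loop like the sketch below (imports as in the earlier sketch, plus encoding/binary). The helper getLastPrunedHeight, the field s.db, and the key string are illustrative stand-ins for the PR's actual code; the little-endian encoding follows the store's convention noted in section 8.

// Sketch of the idempotent progress-tracking pattern.
func (s *DefaultStore) pruneUpTo(ctx context.Context, target uint64) error {
    lastPruned, err := s.getLastPrunedHeight(ctx) // illustrative helper; 0 if never pruned
    if err != nil {
        return err
    }
    if target <= lastPruned {
        return nil // already pruned this far; calling again is a no-op
    }
    batch, err := s.db.Batch(ctx)
    if err != nil {
        return err
    }
    for h := lastPruned + 1; h <= target; h++ {
        // delete /h/{h}, /d/{h}, /c/{h}, the hash index, and DA metadata ...
    }
    // Advance the marker in the same batch as the deletions, so a crash
    // mid-prune can only repeat work, never skip heights.
    buf := make([]byte, 8)
    binary.LittleEndian.PutUint64(buf, target)
    if err := batch.Put(ctx, ds.NewKey("/last-pruned-block-height"), buf); err != nil {
        return err
    }
    return batch.Commit(ctx)
}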

Suggestions:

  • pkg/store/store.go:458: Consider adding metrics for pruning failures to enable monitoring/alerting in production
  • block/internal/executing/executor.go:458: The error log could include more context (e.g., height range attempted, error count) for operational debugging

3. Security ⭐⭐⭐⭐⭐

Analysis:

  • No privilege escalation risks: Pruning only deletes data, doesn't modify or access sensitive information
  • Proper input validation: config/config.go:317-328 validates pruning configuration before accepting it
  • State integrity preserved: Pruning explicitly avoids /s/{height} (state snapshots) and /t (current height) keys - critical for consensus safety
  • Atomic batch operations: All deletions use batched writes with commit at the end (store.go:284-367, evm/store.go:168-191)
  • No injection risks: All keys are programmatically constructed, not user-supplied

Observation:

The design correctly prioritizes consensus safety over disk usage by:

  1. Never pruning state snapshots (needed for rollback/sync)
  2. Only pruning historical block data (header/data/signature/index)
  3. Making pruning failures non-fatal to block production

4. Performance & Resource Efficiency ⭐⭐⭐⭐

Design Trade-offs (Acknowledged in PR):

Critical Path Execution (executor.go:447-473):

  • Acceptable: Pruning runs synchronously in produceBlock after commit
  • ⚠️ Consideration: On intervals where pruning triggers (e.g., every 1000 blocks with 1000 block retention), there will be a latency spike proportional to:
    • Number of heights to prune (controlled by PruningInterval)
    • Underlying storage performance (badgerDB, levelDB, etc.)
    • Amount of data per height

Mitigation Recommendations:

  1. Operators should tune parameters carefully:

    # Example conservative config
    pruning_interval: 100        # Prune more frequently
    pruning_keep_recent: 10000   # But keep more history
    # This spreads pruning cost over more block production cycles
  2. Consider async pruning in future (out of scope for this PR):

    // Future enhancement: launch pruning in background goroutine
    go func() {
        if err := e.store.PruneBlocks(ctx, targetHeight); err != nil {
            e.logger.Error().Err(err).Msg("background pruning failed")
        }
    }()

Memory Efficiency:

  • ✅ Batch operations minimize write amplification
  • ✅ No unbounded allocations in pruning loops
  • ✅ Progress tracking prevents redundant work on retry

5. Testing & Reliability ⭐⭐⭐⭐⭐

Excellent Coverage:

pkg/store/store_test.go:737-814 - PruneBlocks tests:

  • TestPruneBlocks_RemovesOldBlockDataOnly: Verifies selective pruning (removes headers/data/sigs/index, preserves state)
  • ✅ Validates DA metadata cleanup (lines 795-813)
  • ✅ Tests idempotency via last-pruned-height tracking

execution/evm/store_test.go:20-99 - ExecMeta pruning tests:

  • TestPruneExecMeta_PrunesUpToTargetHeight: Basic functionality
  • TestPruneExecMeta_TracksLastPrunedHeight: Progress tracking and resumability (lines 62-99)
  • ✅ Proper test isolation with in-memory datastores

Missing Test Cases:

  1. Pruning failure handling: No test simulating batch commit failure during pruning (should verify error propagation)
  2. Concurrent pruning: No test for calling PruneBlocks concurrently (though not expected in normal operation)
  3. Edge case: Pruning when PruningKeepRecent equals current height (boundary condition)
  4. EVM store: No test for pruning with missing ExecMeta entries (though implementation handles this)

Test Quality Observation:

The existing rollback tests (store_test.go:661-735, 817-1274) serve as good reference models. The pruning tests follow similar patterns effectively.


6. Documentation & Developer Experience ⭐⭐⭐⭐

Configuration Documentation:

pkg/config/config.go:239-244:

// Pruning configuration
PruningEnabled    bool   `comment:"Enable height-based pruning of stored block data..."`
PruningKeepRecent uint64 `comment:"Number of most recent blocks to retain..."`
PruningInterval   uint64 `comment:"Run pruning every N blocks..."`
  • ✅ Clear comments explaining each parameter
  • ✅ Validation logic with helpful error messages (lines 317-328)
  • ✅ Flags properly registered in AddFlags (lines 391-393)

Code Comments:

pkg/store/store.go:277-283:

// PruneBlocks removes block data (header, data, signature, and hash index)
// up to and including the given height from the store. It does not modify
// the current chain height or any state snapshots.
  • ✅ Clear description of what is pruned
  • ✅ Explicitly states what is NOT pruned (critical for safety)
  • ⚠️ Could add example: "E.g., PruneBlocks(ctx, 100) removes heights 1-100"
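Spelled out, that suggested example could read:

// PruneBlocks removes block data (header, data, signature, and hash index)
// up to and including the given height from the store. For example,
// PruneBlocks(ctx, 100) removes heights 1-100 (or lastPruned+1 through 100
// on subsequent calls). It does not modify the current chain height or any
// state snapshots.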

Suggestions:

  1. Add operational notes to config comments:

    PruningInterval uint64 `comment:"Run pruning every N blocks. Lower values spread pruning cost but increase overhead. Recommended: 100-1000."`
  2. Document go-header store pruning (sync_service.go:86-97):

    • The PR description mentions setting the go-header pruning window, but there is no visible code change for this in the diff. It might be in a different file or handled implicitly; consider adding a comment explaining the relationship.

7. Architecture Observations

Design Highlights:

1. Clean Layered Pruning (3 layers working in concert):

Block Executor (block/internal/executing/executor.go:447-473)
    ↓ orchestrates
Store Pruning (pkg/store/store.go:277-368)
    ↓ ev-node block data
Execution Pruning (execution/evm/store.go:148-194)
    ↓ EVM-specific ExecMeta

2. Optional Interface Pattern for EVM pruning:

// core/execution/execution.go:165-177
type ExecMetaPruner interface {
    PruneExecMeta(ctx context.Context, height uint64) error
}

// executor.go:466
if pruner, ok := e.exec.(coreexecutor.ExecMetaPruner); ok {
    pruner.PruneExecMeta(e.ctx, targetHeight)
}

This is excellent design because:

  • ✅ Zero impact on non-EVM executors (ABCI, etc.)
  • ✅ Type-safe optional behavior via interface assertion
  • ✅ Extensible for future executor types

3. Metadata-Driven Progress Tracking:
Both stores use the same pattern to track pruning progress:

  • pkg/store/keys.go:28-32: LastPrunedBlockHeightKey
  • execution/evm/store.go:18-21: lastPrunedExecMetaKey

This enables:

  • ✅ Resumable pruning across restarts
  • ✅ Idempotent operations (safe to retry)
  • ✅ Efficient range queries (lastPruned + 1 to height)

8. Specific Code Issues & Recommendations

Issue 1: Inconsistent Height Encoding

Observation: Two different endianness conventions:

  • pkg/store/store.go:373-375: Uses binary.LittleEndian for block heights
  • execution/evm/store.go:104, 160, 184: Uses binary.BigEndian for ExecMeta heights

Impact: Low (keys are in separate namespaces), but could cause confusion.

Recommendation: Document the rationale in a comment or standardize on one convention project-wide.
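As background for the recommendation (an aside, not from the PR): big-endian fixed-width keys sort lexicographically in numeric order, which is what ordered key scans over heights want, while little-endian values are fine for metadata that is only ever read back whole. A self-contained check:

package main

import (
    "bytes"
    "encoding/binary"
    "fmt"
)

func main() {
    be := func(h uint64) []byte { b := make([]byte, 8); binary.BigEndian.PutUint64(b, h); return b }
    le := func(h uint64) []byte { b := make([]byte, 8); binary.LittleEndian.PutUint64(b, h); return b }

    // Big-endian preserves numeric order under byte-wise comparison.
    fmt.Println(bytes.Compare(be(255), be(256)) < 0) // true
    // Little-endian does not: 255 encodes as ff00..., 256 as 0001...
    fmt.Println(bytes.Compare(le(255), le(256)) < 0) // false
}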


Issue 2: Hardcoded Pruning Logic

Location: block/internal/executing/executor.go:451-473

Current:

if e.config.Node.PruningEnabled && e.config.Node.PruningKeepRecent > 0 && e.config.Node.PruningInterval > 0 {
    if newHeight%e.config.Node.PruningInterval == 0 {
        // ... pruning logic ...
    }
}

Recommendation: Extract to method for testability:

func (e *Executor) shouldPruneAtHeight(height uint64) (bool, uint64) {
    if !e.config.Node.PruningEnabled || 
       e.config.Node.PruningKeepRecent == 0 || 
       e.config.Node.PruningInterval == 0 {
        return false, 0
    }
    
    if height%e.config.Node.PruningInterval != 0 {
        return false, 0
    }
    
    if height <= e.config.Node.PruningKeepRecent {
        return false, 0
    }
    
    return true, height - e.config.Node.PruningKeepRecent
}

// In produceBlock:
if shouldPrune, targetHeight := e.shouldPruneAtHeight(newHeight); shouldPrune {
    e.store.PruneBlocks(e.ctx, targetHeight)
    // ...
}

Issue 3: Missing Metrics

Location: block/internal/executing/executor.go:457-470

Recommendation: Add pruning metrics for observability:

// In metrics.go:
PruningDuration *prometheus.HistogramVec // labels: store={block,execmeta}, status={success,failure}
PruningHeightsPruned prometheus.Counter

// In executor.go:
start := time.Now()
err := e.store.PruneBlocks(e.ctx, targetHeight)
if err != nil {
    e.metrics.PruningDuration.WithLabelValues("block", "failure").Observe(time.Since(start).Seconds())
    e.logger.Error().Err(err).Msg("failed to prune old block data")
} else {
    e.metrics.PruningDuration.WithLabelValues("block", "success").Observe(time.Since(start).Seconds())
    e.metrics.PruningHeightsPruned.Add(float64(targetHeight - lastPruned))
}

9. Testing Recommendations

High Priority:

  1. Add error injection test for batch commit failure:

    func TestPruneBlocks_BatchCommitFailure(t *testing.T) {
        // Mock store that fails on batch.Commit()
        // Verify error is propagated correctly
        // Verify last-pruned-height is NOT updated on failure
    }
  2. Add integration test for concurrent pruning (even if just to verify it's safe):

    func TestPruneBlocks_Concurrent(t *testing.T) {
        // Launch 2 goroutines calling PruneBlocks with same height
        // Verify both complete without panic/corruption
    }

Medium Priority:

  1. Add test for pruning at exact PruningKeepRecent boundary:
    func TestPruneBlocks_AtBoundary(t *testing.T) {
        // When height == PruningKeepRecent, verify nothing is pruned
        // When height == PruningKeepRecent + 1, verify exactly height 1 is pruned
    }

10. Security Considerations

Reviewed Attack Vectors:

  1. Malicious config values: ✅ Validated in config.go:317-328
  2. State corruption via selective pruning: ✅ Mitigated by preserving state snapshots
  3. Disk fill DoS via failed pruning: ⚠️ Needs operator monitoring (metrics would help)
  4. Consensus desync: ✅ Pruning only affects local storage, not consensus

Recommendation:

Add health check endpoint that exposes pruning status:

GET /health
{
  "pruning": {
    "enabled": true,
    "last_pruned_height": 9500,
    "keep_recent": 500,
    "disk_savings_gb": 42.3  // estimated
  }
}

Summary & Verdict

Overall Assessment: Approved with Minor Suggestions

This is a production-ready implementation with:

  • ✅ Solid architecture and clean separation of concerns
  • ✅ Comprehensive error handling and idempotency
  • ✅ Excellent test coverage for core functionality
  • ✅ Clear configuration validation and documentation
  • ✅ Proper attention to consensus safety

Before Merging:

Required:

  • None (implementation is safe to merge)

Recommended:

  1. Add metrics for pruning operations (high value for production observability)
  2. Extract shouldPruneAtHeight() helper for better testability
  3. Add missing edge case tests (concurrent pruning, batch failure)
  4. Consider documenting disk space savings formula in config comments

Post-Merge Follow-ups:

  1. Monitor production impact: Track block production latency on pruning intervals
  2. Tune default parameters: Current defaults may need adjustment based on real workload
  3. Async pruning enhancement: Consider moving pruning to background goroutine if latency spikes are observed
  4. Add ABCI pruning: Per PR description, ev-abci pruning is planned for a follow-up

Great work on this feature! The implementation demonstrates strong engineering practices and attention to detail. The design choices (synchronous pruning, best-effort semantics, optional interfaces) are well-reasoned and documented.

@julienrbrt (Member) commented Jan 16, 2026

Given that we have a lot of different stores and keys, are we planning to prune the following as well:

(and eventually the ev-abci store)

It will be inconsistent if we don't allow height-based pruning of all those stores at once.

@pthmas (Contributor, Author) commented

I've updated this PR to prune metadata, the go-header store, and the EVM store ✅
I'm happy to work on pruning ev-abci next, but I would probably do it in another PR to prevent this one from getting too big, if that's ok.

@pthmas changed the title from "Block Prunning" to "feat: block Prunning" on Jan 26, 2026
@pthmas changed the title from "feat: block Prunning" to "feat: block Pruning" on Jan 26, 2026
@pthmas force-pushed the pierrick/prunning branch from 3e1e8e0 to ed66fe6 on January 26, 2026 at 15:01
cmd.Flags().Duration(FlagScrapeInterval, def.Node.ScrapeInterval.Duration, "interval at which the reaper polls the execution layer for new transactions")
// Pruning configuration flags
cmd.Flags().Bool(FlagPrefixEvnode+"node.pruning_enabled", def.Node.PruningEnabled, "enable height-based pruning of stored block data (headers, data, signatures, index)")
cmd.Flags().Uint64(FlagPrefixEvnode+"node.pruning_keep_recent", def.Node.PruningKeepRecent, "number of most recent blocks to retain when pruning is enabled (0 = keep all)")
A contributor commented:

The help text (0 = keep all) does not look correct in relation to the validation above, which requires keep_recent > 0 when pruning is enabled.

